OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Text/Graphics Separation in Maps

Internal identifier: 001846 (Main/Exploration); previous: 001845; next: 001847

Authors: Ruini Cao [Singapore]; Lim Tan [Singapore]

Source: Lecture Notes in Computer Science (ISSN 0302-9743), 2002

RBID : ISTEX:2806202E744FA13270D3CC536B7030DC32E25B0A

Abstract

The separation of overlapping text and graphics is a challenging problem in document image analysis. This paper proposes a specific method of detecting and extracting characters that are touching graphics. It is based on the observation that the constituent strokes of characters are usually short segments in comparison with those of graphics. It combines line continuation with the feature line width to decompose and reconstruct segments underlying the region of intersection. Experimental results showed that the proposed method improved the percentage of correctly detected text as well as the accuracy of character recognition significantly.
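
The observation stated in the abstract, that character strokes tend to be short segments compared with those of graphics, can be illustrated with a toy length filter. This is only a sketch of that single observation and not the authors' method: it ignores line continuation and line width, and the segment representation, the function names, and the `max_text_length` threshold are all assumptions made for illustration.

```python
import math

def segment_length(seg):
    """Euclidean length of a segment given as ((x0, y0), (x1, y1))."""
    (x0, y0), (x1, y1) = seg
    return math.hypot(x1 - x0, y1 - y0)

def split_by_length(segments, max_text_length):
    """Split segments into short (text-like) and long (graphics-like) ones.

    max_text_length is a hypothetical threshold; a real system would have to
    derive it from the image (for example from the expected character size).
    """
    text_like, graphics_like = [], []
    for seg in segments:
        if segment_length(seg) <= max_text_length:
            text_like.append(seg)
        else:
            graphics_like.append(seg)
    return text_like, graphics_like

# Made-up segments: two short strokes and one long map line.
segments = [((0, 0), (3, 4)), ((0, 0), (120, 0)), ((10, 10), (10, 16))]
text_like, graphics_like = split_by_length(segments, max_text_length=20.0)
print(len(text_like), "text-like;", len(graphics_like), "graphics-like")
```

A length threshold alone would misclassify short graphic fragments and long character strokes, which is presumably why the paper combines the observation with further cues.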

URL: https://api.istex.fr/document/2806202E744FA13270D3CC536B7030DC32E25B0A/fulltext/pdf
DOI: 10.1007/3-540-45868-9_14

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Text/Graphics Separation in Maps</title>
<author>
<name sortKey="Cao, Ruini" sort="Cao, Ruini" uniqKey="Cao R" first="Ruini" last="Cao">Ruini Cao</name>
</author>
<author>
<name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:2806202E744FA13270D3CC536B7030DC32E25B0A</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45868-9_14</idno>
<idno type="url">https://api.istex.fr/document/2806202E744FA13270D3CC536B7030DC32E25B0A/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001203</idno>
<idno type="wicri:Area/Istex/Curation">001130</idno>
<idno type="wicri:Area/Istex/Checkpoint">000F60</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Cao R:text:graphics:separation</idno>
<idno type="wicri:Area/Main/Merge">001926</idno>
<idno type="wicri:Area/Main/Curation">001846</idno>
<idno type="wicri:Area/Main/Exploration">001846</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Text/Graphics Separation in Maps</title>
<author>
<name sortKey="Cao, Ruini" sort="Cao, Ruini" uniqKey="Cao R" first="Ruini" last="Cao">Ruini Cao</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Singapour</country>
<wicri:regionArea>School of Computing, National University of Singapore, 3 Science Drive 2, 117543</wicri:regionArea>
<orgName type="university">Université nationale de Singapour</orgName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Singapour</country>
</affiliation>
</author>
<author>
<name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Singapour</country>
<wicri:regionArea>School of Computing, National University of Singapore, 3 Science Drive 2, 117543</wicri:regionArea>
<orgName type="university">Université nationale de Singapour</orgName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Singapour</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">2806202E744FA13270D3CC536B7030DC32E25B0A</idno>
<idno type="DOI">10.1007/3-540-45868-9_14</idno>
<idno type="ChapterID">14</idno>
<idno type="ChapterID">Chap14</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: The separation of overlapping text and graphics is a challenging problem in document image analysis. This paper proposes a specific method of detecting and extracting characters that are touching graphics. It is based on the observation that the constituent strokes of characters are usually short segments in comparison with those of graphics. It combines line continuation with the feature line width to decompose and reconstruct segments underlying the region of intersection. Experimental results showed that the proposed method improved the percentage of correctly detected text as well as the accuracy of character recognition significantly.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Singapour</li>
</country>
<orgName>
<li>Université nationale de Singapour</li>
</orgName>
</list>
<tree>
<country name="Singapour">
<noRegion>
<name sortKey="Cao, Ruini" sort="Cao, Ruini" uniqKey="Cao R" first="Ruini" last="Cao">Ruini Cao</name>
</noRegion>
<name sortKey="Cao, Ruini" sort="Cao, Ruini" uniqKey="Cao R" first="Ruini" last="Cao">Ruini Cao</name>
<name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
<name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
</country>
</tree>
</affiliations>
</record>
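
As a minimal sketch of how the title and author names could be read from such a record, the following uses Python's standard `xml.etree.ElementTree` on a fragment copied from the `<titleStmt>` element above. Only that fragment is parsed: the full record uses a `wicri:` prefix with no namespace declaration shown here, so a strict XML parser would likely reject it unless that prefix is declared. The `summarize` function name is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the <titleStmt> element of the record.
FRAGMENT = """
<titleStmt>
  <title xml:lang="en">Text/Graphics Separation in Maps</title>
  <author>
    <name sortKey="Cao, Ruini" sort="Cao, Ruini" uniqKey="Cao R" first="Ruini" last="Cao">Ruini Cao</name>
  </author>
  <author>
    <name sortKey="Tan, Lim" sort="Tan, Lim" uniqKey="Tan L" first="Lim" last="Tan">Lim Tan</name>
  </author>
</titleStmt>
"""

def summarize(xml_text):
    """Return (title, [(last, first), ...]) for a titleStmt fragment."""
    root = ET.fromstring(xml_text.strip())
    title = root.findtext("title")
    # Each <name> carries the split name in its 'first' and 'last' attributes.
    authors = [(n.get("last"), n.get("first")) for n in root.iter("name")]
    return title, authors

title, authors = summarize(FRAGMENT)
print(title)
for last, first in authors:
    print(f"{last}, {first}")
```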

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001846 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001846 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:2806202E744FA13270D3CC536B7030DC32E25B0A
   |texte=   Text/Graphics Separation in Maps
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024